Mitigating preventable readmissions, in which patients are readmitted for the same primary diagnosis within 30 days, poses a significant challenge to the delivery of high-quality healthcare. Toward this end, we develop a novel predictive analytics model, termed the beta-geometric/Erlang-2 (BG/EG) hurdle model, which predicts the propensity, frequency, and timing of readmissions of patients diagnosed with congestive heart failure (CHF). This unified model enables us to answer three key questions related to the use of predictive analytics for patient readmissions: whether a readmission will occur, how often readmissions will occur, and when a readmission will occur. We test our model using a unique data set that tracks patient demographic, clinical, and administrative data across 67 hospitals in North Texas over a four-year period. We show that our model provides superior predictive performance compared to extant models such as the logit, BG/NBD hurdle, and EG hurdle models. Our model also allows us to study the association between hospitals' usage of health information technology (IT) and readmission risk. We find that health IT usage, patient demographics, visit characteristics, payer type, and hospital characteristics are significantly associated with patient readmission risk. We also observe that implementation of cardiology information systems is associated with a reduction in the propensity and frequency of future readmissions, whereas administrative IT systems are correlated with a lower frequency of future readmissions. Our results indicate that patient profiles derived from our model can serve as building blocks for a predictive analytics system that identifies CHF patients with high readmission risk.
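A hurdle model of this kind separates the "whether" decision from the "how many" decision. The sketch below is a minimal illustration of the generic hurdle structure only (a Bernoulli hurdle combined with a zero-truncated Poisson count), not the authors' BG/EG specification; all parameter values are hypothetical.

```python
import math

def hurdle_pmf(k, p_readmit, lam):
    """P(K = k) under a Bernoulli hurdle plus a zero-truncated Poisson count.

    p_readmit : probability the patient is readmitted at least once
    lam       : rate of the (truncated) count of readmissions
    Illustrative stand-in only, not the paper's BG/EG components.
    """
    if k == 0:
        return 1.0 - p_readmit  # hurdle not crossed: no readmission
    # zero-truncated Poisson: Poisson pmf renormalized over k >= 1
    pois = math.exp(-lam) * lam**k / math.factorial(k)
    return p_readmit * pois / (1.0 - math.exp(-lam))

# Sanity check: probabilities over k = 0..60 should sum to (almost) 1
total = sum(hurdle_pmf(k, p_readmit=0.3, lam=1.5) for k in range(61))
print(round(total, 6))
```

Because the two parts are separable, covariates (e.g., health IT usage) can shift the propensity part and the frequency part independently, which is what lets the paper report distinct associations for each.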
This paper presents and extends Latent Growth Modeling (LGM) as a complementary method for analyzing longitudinal data, modeling the process of change over time, testing time-centric hypotheses, and building longitudinal theories. We first describe the basic tenets of LGM and offer guidelines for applying LGM to Information Systems (IS) research, specifically how to pose research questions that focus on change over time and how to implement LGM models to test time-centric hypotheses. Second, and more important, we theoretically extend LGM by proposing a <i>model validation</i> criterion, namely “<i>d</i>-<i>separation</i>,” to evaluate <i>why</i> and <i>when</i> LGM works and to test its fundamental properties and assumptions. Our <i>d</i>-separation criterion does not rely on any distributional assumptions about the data; it is grounded in the theory of conditional independence. Third, we conduct extensive simulations to examine a multitude of factors that affect LGM performance. Finally, as a practical application, we apply LGM to model the relationship between word-of-mouth communication (online product reviews) and book sales over time, using 26 weeks of longitudinal data from Amazon. The paper concludes by discussing the implications of LGM for helping IS researchers develop and test longitudinal theories.
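The d-separation criterion is operationalized through testable conditional independencies. As a hedged illustration (synthetic data and a partial-correlation test of our own choosing, not the paper's procedure), the snippet below generates a chain X → Z → Y, where d-separation predicts that X and Y are independent given Z, and checks this prediction:

```python
import random, math

random.seed(0)
n = 5000
# Causal chain X -> Z -> Y: Z d-separates X from Y
X = [random.gauss(0, 1) for _ in range(n)]
Z = [x + random.gauss(0, 1) for x in X]
Y = [z + random.gauss(0, 1) for z in Z]

def corr(a, b):
    """Pearson correlation coefficient."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, c):
    """Correlation of a and b after controlling for c."""
    rab, rac, rbc = corr(a, b), corr(a, c), corr(b, c)
    return (rab - rac * rbc) / math.sqrt((1 - rac**2) * (1 - rbc**2))

print(round(corr(X, Y), 2))             # sizable marginal correlation
print(round(partial_corr(X, Y, Z), 2))  # near 0: X d-separated from Y by Z
```

The marginal correlation is large, but conditioning on Z drives it to roughly zero, exactly the pattern a d-separation claim asserts for the implied model structure.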
The 50-year march of Moore’s Law has led to the creation of a relatively cheap and increasingly easy-to-use worldwide digital infrastructure of computers, mobile devices, broadband network connections, and advanced application platforms. This digital infrastructure has, in turn, accelerated the emergence of new technologies that enable transformations in how we live and work, how companies organize, and how entire industries are structured.
Reviews and product recommendations at online stores enable customers to readily evaluate alternative products prior to purchase. In this context, firms generate recommendations to refer customers to a wider variety of products. They also display customer-generated online reviews to facilitate evaluation of those recommended products. This study integrates these two IT artifacts to investigate consumer choice among competing products. We use a dataset we collected from Amazon.com consisting of books, sales ranks, recommendations, reviews, and reviewers. Using comprehensive econometric analyses, we derive the granular impact of reviews, product referrals, and reviewer opinions on the dynamics of product sales within a competitive market.
Today, few firms could survive for very long without their computer systems. IT has permeated every corner of firms. Firms have reached the current state in their use of IT because IT has provided myriad opportunities for firms to improve performance, and firms have availed themselves of these opportunities. Some have argued, however, that the opportunities for firms to improve their performance through new uses of IT have been declining. Are the opportunities to use IT to improve firm performance diminishing? We sought to answer this question. In this study, we develop a theory and explain the logic behind our empirical analysis, an analysis that employs a different type of event study. Using the volatility of firms' stock prices in response to news signaling a change in economic conditions, we compare the stock price behavior of firms in the IT industry to that of firms in the utility and transportation and freight industries. Our analysis of the IT industry as a whole indicates that the opportunities for firms to use IT to improve their performance are not diminishing. However, there are sectors within the IT industry that no longer provide value-enhancing opportunities for firms. We also find that IT products that provided opportunities for firms to create value at one point in time later become necessities for staying in business. Our results support the key assumption in our work.
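The core computation behind a volatility-based event study can be sketched as comparing return volatility in a window following the news to volatility in a baseline window. This is a simplified illustration with made-up returns, not the authors' estimation procedure:

```python
import statistics

# Hypothetical daily returns (%) for one firm; the last five days follow
# a news event signaling a change in economic conditions.
baseline_returns = [0.2, -0.1, 0.3, 0.0, -0.2, 0.1, -0.3, 0.2, 0.1, -0.1]
event_returns = [1.5, -2.0, 1.8, -1.2, 2.1]

def volatility(returns):
    """Sample standard deviation of a return series."""
    return statistics.stdev(returns)

# A ratio well above 1 suggests the news carried value-relevant
# information for the firm, i.e., opportunities still exist.
ratio = volatility(event_returns) / volatility(baseline_returns)
print(round(ratio, 2))
```

Comparing such ratios across the IT, utility, and transportation industries is, loosely, how one would contrast firms whose prospects still respond to new opportunities with firms in mature, necessity-like industries.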
Managers routinely seek to understand firm performance relative to competitors. Recently, competitive intelligence (CI) has emerged as an important area within business intelligence (BI) in which the emphasis is on understanding and measuring a firm's external competitive environment. A requirement of such systems is the availability of rich data about a firm's competitors, which is typically hard to acquire. This paper proposes a method to incorporate competitive intelligence into BI systems by using less granular, aggregate data, which is usually easier to acquire. We motivate, develop, and validate an approach to infer key competitive measures about customer activities without requiring detailed cross-firm data. Instead, our method derives these competitive measures for online firms from simple "site-centric" data that are commonly available, augmented with aggregate data summaries that may be obtained from syndicated data providers. Based on data provided by comScore Networks, we show empirically that our method performs well in inferring several key diagnostic competitive measures (penetration, market share, and share of wallet) for various online retailers.
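The arithmetic behind such inferred measures can be illustrated with hypothetical numbers (these are simple textbook definitions combined with aggregate summaries, not comScore's data or the paper's exact estimators):

```python
# Site-centric data a retailer observes about itself (hypothetical)
site_buyers = 40_000       # unique customers who purchased at the site
site_revenue = 1_600_000   # the site's own sales ($)

# Aggregate summaries obtainable from a syndicated data provider
category_shoppers = 500_000     # online shoppers active in the category
category_revenue = 25_000_000   # total category sales ($)
avg_spend_per_shopper = category_revenue / category_shoppers

penetration = site_buyers / category_shoppers   # reach of the site
market_share = site_revenue / category_revenue  # share of category sales
# Share of wallet: what the site's own buyers spend there, relative to
# what an average category shopper spends in the category overall
share_of_wallet = (site_revenue / site_buyers) / avg_spend_per_shopper

print(penetration, market_share, round(share_of_wallet, 2))
```

The point of the paper's method is precisely that the two aggregate inputs substitute for detailed cross-firm data, which individual firms rarely have.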
Because a fundamental attribute of a good theory is causality, the information systems (IS) literature has strived to infer causality from empirical data, typically seeking causal interpretations from longitudinal, experimental, and panel data that include time precedence. However, such data are not always obtainable, and observational (cross-sectional, nonexperimental) data are often the only data available. To infer causality from observational data, which are common in empirical IS research, this study develops a new data analysis method that integrates the Bayesian networks (BN) and structural equation modeling (SEM) literatures. Similar to SEM techniques (e.g., LISREL and PLS), the proposed Bayesian networks for latent variables (BN-LV) method tests both the measurement model and the structural model. The method operates in two stages: first, it inductively identifies the most likely LVs from measurement items without prespecifying a measurement model; second, it compares all the possible structural models among the identified LVs in an exploratory (automated) fashion and discovers the most likely causal structure. By exploring causal structural models that are not restricted to linear relationships, BN-LV contributes to the empirical IS literature by overcoming three SEM limitations identified by Lee, Barua, and Whinston (1997, "Discovery and representation of causal relationships in MIS research: A methodological framework," MIS Quarterly 21(1), 109-136): lack of causality inference, restrictive model structure, and lack of nonlinearities. Moreover, BN-LV extends the BN literature by (1) overcoming the problem of latent variable identification using observed (raw) measurement items as the only inputs, and (2) enabling the use of ordinal and discrete (Likert-type) data, which are commonly used in empirical IS studies.
The BN-LV method is first illustrated and tested with actual empirical data to demonstrate how it can help reconcile competing hypotheses in terms of the direction of causality in a structural model. Second, we conduct a comprehensive simulation study to demonstrate the effectiveness of BN-LV compared to existing techniques in the SEM and BN literatures. The advantages of BN-LV in terms of measurement model construction and structural model discovery are discussed.
In this paper we pursue three main objectives: (1) to develop a model of an intermediated search market in which matching between consumers and firms takes place primarily via paid referrals; (2) to address the question of designing a suitable mechanism for selling referrals to firms; and (3) to characterize and analyze the firms' bidding strategies given consumers' equilibrium search behavior. To achieve these objectives we develop a two-stage model of search intermediaries in a vertically differentiated product market. In the first stage, an intermediary chooses a search engine design that specifies to what extent a firm's search rank is determined by its bid and to what extent it is determined by the product offering's performance. In the second stage, based on the search engine design, competing firms place their open bids to be paid to the search engine for each referral. We find that the revenue-maximizing search engine design bases rankings on a weighted average of product performance and bid amount. Nonzero pure-strategy equilibria of the underlying discontinuous bidding game generally exist but are not robust with respect to noisy clicks in the system. We determine a unique nondegenerate mixed-strategy Nash equilibrium that is robust to noisy clicks. In this equilibrium, firms of low product performance fully dissipate their rents, which are appropriated by the search intermediary and the firm with the better product. The firms' expected bid amounts are generally nonmonotonic in product performance and depend on the search engine design parameter. The intermediary's profit-maximizing design choice, by attributing a positive weight to the firms' bids, tends to obfuscate search results and reduce overall consumer surplus compared to the socially optimal design of fully transparent results ranked purely on product performance.
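The design choice can be sketched as a weighting rule: with a design parameter alpha, the engine scores firm i by alpha times its product performance plus (1 - alpha) times its bid. The notation, scales, and numbers below are illustrative assumptions, not the paper's exact formulation:

```python
def rank_firms(firms, alpha):
    """Rank firms by a weighted average of product performance and bid.

    alpha = 1 -> fully transparent ranking on performance alone;
    alpha = 0 -> pure bid-based ranking.
    Performance and bids are assumed to be on comparable scales
    (an illustrative simplification).
    """
    scored = {name: alpha * perf + (1 - alpha) * bid
              for name, (perf, bid) in firms.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical firms: (product performance, per-referral bid)
firms = {"A": (0.9, 0.2), "B": (0.5, 0.8), "C": (0.3, 0.9)}

print(rank_firms(firms, alpha=1.0))  # performance only -> ['A', 'B', 'C']
print(rank_firms(firms, alpha=0.0))  # bids only -> ['C', 'B', 'A']
```

Moving alpha below 1 lets high bidders with weaker products climb the ranking, which is the mechanism behind the paper's finding that a revenue-maximizing design obfuscates results relative to the socially optimal, fully transparent ranking.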
Due to the vast amount of user data tracked online, the use of data-based analytical methods is becoming increasingly common for e-businesses. Recently, the term analytical eCRM has been used to refer to the use of such methods in the online world. A characteristic of most current approaches in eCRM is that they use data collected about users' activities at a single site only, and, as we argue in this paper, this can present an incomplete picture of user activity. However, it is possible to obtain a complete picture of user activity from cross-site data on users. Such data is expensive but can be obtained by firms directly from their users or from market data vendors. A critical question is whether such data is worth obtaining, an issue that little prior research has addressed. In this paper, using a data mining approach, we present an empirical analysis of the modeling benefits that can be obtained by having complete information. Our results suggest that the magnitudes of gains that can be obtained from complete data range from a few percentage points to 50 percent, depending on the problem for which the data is used and the performance metrics considered. Qualitatively, we find that variables related to customer loyalty and browsing intensity are particularly important, and these variables are difficult to derive from data collected at a single site. More important, we find that a firm has to collect a reasonably large amount of complete data before any benefits can be reaped, and we caution against acquiring too little data.